Replica-Aware Job Scheduling in Distributed Systems

Authors

  • Wei-Cheng Liao
  • Jan-Jan Wu
Abstract

This paper proposes an effective replica-aware scheduling algorithm for independent jobs in Grid and distributed systems. The proposed algorithm considers not only the execution time of jobs but also the location and transfer time of the data and data replicas that these jobs require. We propose a cost model to estimate the starting time and earliest completion time of a job and its associated data (original or replicated). Based on the estimated times, the scheduling algorithm finds a proper execution sequence for the jobs and the data with the goal of minimizing the makespan of the jobs. Our experimental results demonstrate that the proposed algorithm is scalable and outperforms a random job selection strategy. We also show that the proposed algorithm performs well compared to a conservative theoretical lower bound, with performance within 15% of the lower bound on average and within 40% in the worst case.
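The idea of choosing jobs by estimated completion time, including data transfer, can be illustrated with a minimal sketch. This is not the authors' algorithm or cost model, which the abstract does not specify. It is a simple greedy heuristic under stated assumptions: each job reads one input file, all nodes run a job equally fast, a single bandwidth value applies to every transfer, and a node fetches a missing file before it starts computing. All names (`Job`, `schedule`, `replicas`, `file_size`, `bandwidth`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    run_time: float  # execution time, assumed identical on every node
    data: str        # id of the single input file (an assumption)


def schedule(jobs, nodes, replicas, file_size, bandwidth):
    """Greedy sketch: repeatedly pick the (job, node) pair with the
    earliest estimated completion time, counting data transfer.

    replicas maps file id -> set of nodes that already hold a copy.
    Returns (plan, makespan), where plan lists (job, node, finish).
    """
    node_free = {n: 0.0 for n in nodes}                   # when each node is idle
    holders = {f: set(s) for f, s in replicas.items()}    # copy, do not mutate input
    pending = list(jobs)
    plan = []
    while pending:
        best = None
        for job in pending:
            for node in nodes:
                # No transfer cost if the node already holds a replica.
                if node in holders[job.data]:
                    fetch = 0.0
                else:
                    fetch = file_size[job.data] / bandwidth
                finish = node_free[node] + fetch + job.run_time
                if best is None or finish < best[0]:
                    best = (finish, job, node)
        finish, job, node = best
        node_free[node] = finish
        holders[job.data].add(node)  # the fetched file becomes a new replica
        pending.remove(job)
        plan.append((job.name, node, finish))
    return plan, max(node_free.values())


if __name__ == "__main__":
    jobs = [Job("j1", 5.0, "f1"), Job("j2", 5.0, "f2"), Job("j3", 5.0, "f1")]
    plan, makespan = schedule(
        jobs,
        nodes=["A", "B"],
        replicas={"f1": {"A"}, "f2": {"B"}},
        file_size={"f1": 10, "f2": 10},
        bandwidth=1.0,
    )
    print(plan, makespan)
```

In this toy setup the heuristic keeps each job on a node that already holds its input, because moving a file costs more than waiting for the node to become free. A fuller model would overlap transfers with computation and account for per-link bandwidth, which this sketch deliberately omits.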


Related resources

An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity

Data grid technology, which uses the scale of the Internet to overcome storage limitations for huge amounts of data, has become one of the hot research topics. Recently, data replication strategies have been widely employed in distributed environments to copy frequently accessed data to suitable sites. The primary purposes are shortening the distance of file transmission and achieving files from ...


Green Energy-aware task scheduling using the DVFS technique in Cloud Computing

Nowadays, energy consumption has become a critical issue in high-performance distributed computing systems, so green computing tries to reduce energy consumption, carbon footprint, and CO2 emissions in high-performance computing systems (HPCs) such as clusters, Grids, and Clouds that run a large number of parallel tasks. Reducing energy consumption for high-end computing can bring various benefits such as red...


A New Job Scheduling in Data Grid Environment Based on Data and Computational Resource Availability

A Data Grid is an infrastructure that manages a huge number of data files and provides intensive computational resources across geographically distributed collaborations. The heterogeneity and geographic dispersion of grid resources and applications pose complex problems such as job scheduling. Most existing scheduling algorithms in Grids only focus on one kind of Grid jobs which can be data...


Combination of data replication and scheduling algorithm for improving data availability in Data Grids

Data Grid is a geographically distributed environment that deals with large-scale data-intensive applications. Effective scheduling in Grid can reduce the amount of data transferred among nodes by submitting a job to a node, where most of the requested data files are available. Data replication is another key optimization technique for reducing access latency and managing large data by storing ...


A JOINT DUTY CYCLE SCHEDULING AND ENERGY AWARE ROUTING APPROACH BASED ON EVOLUTIONARY GAME FOR WIRELESS SENSOR NETWORKS

Network throughput and energy conservation are two conflicting important performance metrics for wireless sensor networks. Since these two objectives are in conflict with each other, it is difficult to achieve them simultaneously. In this paper, a joint duty cycle scheduling and energy aware routing approach is proposed based on evolutionary game theory which is called DREG. Making a trade-off ...




Publication date: 2010